Cluster Forests
Authors
Abstract
Inspired by Random Forests (RF) in the context of classification, a new clustering ensemble method, Cluster Forests (CF), is proposed. Geometrically, CF randomly probes a high-dimensional data cloud to obtain “good local clusterings” and then aggregates them via spectral clustering to obtain cluster assignments for the whole dataset. The search for good local clusterings is guided by a cluster quality measure, kappa. CF progressively improves each local clustering in a fashion that resembles tree growth in RF. Empirical studies on several real-world datasets under two different performance metrics show that CF compares favorably to its competitors. Theoretical analysis reveals that the kappa measure makes it possible to grow the local clustering in a desirable way: it is “noise-resistant”. A closed-form expression is obtained for the mis-clustering rate of spectral clustering under a perturbation model, which yields new insights into some aspects of spectral clustering.
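The pipeline the abstract outlines (random probing, kappa-guided growth of local clusterings, spectral aggregation) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: k-means as the local clusterer, a within/between sum-of-squares ratio as a stand-in for kappa, the one-feature-at-a-time growth schedule, averaging co-cluster indicator matrices as the aggregation input, and all function and parameter names.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=30):
    """Lloyd's k-means with farthest-point initialisation; returns labels."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old centre if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def kappa(X, labels):
    """Assumed quality score (lower is better): within / between sum of squares."""
    grand = X.mean(axis=0)
    ss_w = ss_b = 0.0
    for j in np.unique(labels):
        pts = X[labels == j]
        centre = pts.mean(axis=0)
        ss_w += ((pts - centre) ** 2).sum()
        ss_b += len(pts) * ((centre - grand) ** 2).sum()
    return ss_w / ss_b if ss_b > 0 else np.inf

def grow_local_clustering(X, k, rng, n_start=2, n_rounds=3):
    """Start from random features; keep a candidate feature only if kappa improves."""
    d = X.shape[1]
    feats = list(rng.choice(d, size=min(n_start, d), replace=False))
    labels = kmeans(X[:, feats], k, rng)
    best = kappa(X[:, feats], labels)
    for _ in range(n_rounds):
        rest = [f for f in range(d) if f not in feats]
        if not rest:
            break
        trial = feats + [rest[rng.integers(len(rest))]]
        trial_labels = kmeans(X[:, trial], k, rng)
        score = kappa(X[:, trial], trial_labels)
        if score < best:
            feats, labels, best = trial, trial_labels, score
    return labels

def cluster_forest_sketch(X, k, n_local=20, seed=0):
    """Average co-cluster indicators over local clusterings, then cluster spectrally."""
    rng = np.random.default_rng(seed)
    n = len(X)
    affinity = np.zeros((n, n))
    for _ in range(n_local):
        labels = grow_local_clustering(X, k, rng)
        affinity += labels[:, None] == labels[None, :]
    affinity /= n_local
    deg = affinity.sum(axis=1)  # positive, since the diagonal is 1
    norm_aff = affinity / np.sqrt(np.outer(deg, deg))
    _, vecs = np.linalg.eigh(norm_aff)  # eigenvalues in ascending order
    emb = vecs[:, -k:]                  # top-k eigenvectors
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    return kmeans(emb, k, rng), affinity

# Toy data: two groups separated in 4 of 6 features; the other 2 are pure noise.
data_rng = np.random.default_rng(1)
truth = np.repeat([0, 1], 30)
X = data_rng.normal(size=(60, 6))
X[truth == 1, :4] += 8.0
labels, affinity = cluster_forest_sketch(X, k=2)
```

The growth loop accepts a candidate feature only when the quality score improves, which is the sense in which this sketch mimics tree growth; the acceptance rule and growth schedule of the actual method may differ.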
Journal title:
- Computational Statistics & Data Analysis
Volume 66, Issue
Pages -
Publication date 2013